Unlike 3C protease, the severe acute respiratory syndrome coronavirus (SARS-CoV) 3C-like protease 
(3CLpro) is only enzymatically active as a homodimer and its catalysis is under extensive regulation by the 
unique extra domain. Despite intense studies, two puzzles still remain: (i) how the dimer-monomer switch is 
controlled and (ii) why dimerization is absolutely required for catalysis. Here we report the monomeric crystal 
structure of the SARS-CoV 3CLpro mutant R298A at a resolution of 1.75 Å. Detailed analysis reveals that 
Arg298 serves as a key component for maintaining dimerization, and consequently, its mutation triggers a 
cooperative switch from a dimer to a monomer. The monomeric enzyme is irreversibly inactivated because its 
catalytic machinery is frozen in the collapsed state, characterized by the formation of a short 3₁₀-helix from an 
active-site loop. Remarkably, dimerization appears to be coupled to catalysis in 3CLpro through the use of 
overlapping residues for two networks, one for dimerization and another for catalysis. 


Severe acute respiratory syndrome (SARS) is the first 
emerging infectious disease of the 21st century and is caused by 
a novel coronavirus termed SARS-CoV. It suddenly broke out 
in 2002 and then rapidly spread to 32 countries, causing ~8,500 
infections and over 900 deaths (http://www.who.int/csr/sars/en/). 
Although the outbreak was contained by the summer of 2003, several 
infections were reported later, a warning that SARS may return. 
So far, neither a vaccine nor an efficacious therapy has 
been available. Therefore, it is urgently necessary to design 
potential therapeutic agents against SARS. 

The SARS-CoV 3C-like protease (3CLpro), also known as 
the main protease (Mpro), plays a vital role in processing two 
viral polyproteins, pp1a (486 kDa) and pp1ab (790 kDa), into 
active subunits required for genome replication and transcrip- 
tion. As a consequence, it has been validated to be a key target 
for development of antiviral therapies (3). Although the CoV 
3CLpro is so named to reflect the similarity of its active site to 
that of the picornavirus 3C protease, 3CLpro acquires an extra 
C-terminal domain in addition to the chymotrypsin fold 
adopted by the 3C protease to host the complete catalytic 
machinery (1, 2, 31). Intriguingly, unlike 3C protease, only the 
homodimeric form of the SARS-CoV 3CLpro has been char- 
acterized to be catalytically active (10, 25, 26) and the unique 
extra domain was found to play a key role in controlling dimer- 
monomer equilibrium (7, 11, 14, 15, 25, 26). However, despite 
intense studies of the SARS-CoV 3CLpro (2, 6, 7, 10, 11, 
13-15, 19, 20, 23, 25-27, 28, 29, 31, 32), two puzzles still remain 
unsolved: (i) how the dimer-monomer switch is controlled and 
(ii) why dimerization is absolutely essential for catalysis. 
Addressing these two issues requires a high-resolution structure 
of the monomeric enzyme. 

The SARS-CoV 3CLpro assembles into a homodimer with 
an interface area of over 1,000 Å², and dissociation constants 
have been estimated to range from ~nM to μM by different 
groups (7, 10, 11, 13-16, 19, 20, 27, 28, 31). Previously we have 
conducted a systematic mapping of the interfacial residues 
critical for dimerization, and the results led to a proposal that 
the residues critical for maintaining dimerization and regulat- 
ing catalysis are organized as a nano-scale-channel-like net- 
work (25). Noticeably, a recent study suggests that the residues 
for controlling dimerization of the SARS-CoV 3CLpro might 
not be limited just to the interfacial ones, because Ser147 
located away from the interface was also identified to be cru- 
cial for dimerization (4). In our previous study, the residue 
Arg298, located at the end of the extra domain, was also 
uncovered to be exceptionally important for both dimerization 
and enzymatic activity. Replacement of Arg298 by Ala 
converted the enzyme into an inactive, monomeric 
form (25). However, due to the absence of a high-resolution 
three-dimensional structure of the monomeric enzyme, it is 
difficult to rationalize how a point mutation is capable of abol- 
ishing the SARS-CoV 3CLpro dimer with a relatively large 
interface and high affinity. Moreover, it is completely unclear 
how the enzyme rearranges its structure in response to the loss 
of dimerization as well as what structural changes are account- 
able for inactivating the catalytic machinery in the monomeric 
enzyme. 

In the present study, we succeeded in determining the crys- 
tallographic structure of the monomeric R298A mutant at a 
resolution of 1.75 Å. Detailed analysis of this structure in 
conjunction with previously determined dimeric forms offers 
key insights into why the extra domain is able to mediate the 
dimerization as well as how the dimerization is coupled to the 
catalysis of the SARS-CoV 3CLpro. 


MATERIALS AND METHODS 


Generation of the recombinant R298A mutant. Recently the extra N-terminal 
residues left over from the fusion proteins were found to induce the collapsed 
conformation of the active site; mutagenesis was thus performed on our previous 
plasmid (25) to convert the thrombin cleavage site (CTG GTT CCG CGT GGA 
TCC [LVPRGS]) to a factor Xa cleavage site (GAA GGT CGT [IEGR]) by 
using two primers, 5'-GGC GAC CAT CCT CCA AAA TCG GAT CTG ATC 
GAA GGT CGT AGT GGT TTT AGG AAA ATG GCA TTC CCG-3' (forward) 
and 5'-CGG GAA TGC CAT TTT CCT AAA ACC ACT ACG ACC TTC 
GAT CAG ATC CGA TTT TGG AGG ATG GTC GCC-3' (reverse). The 
GST-fused R298A protease was expressed in Escherichia coli strain BL21(DE3) 
with induction by 0.4 mM isopropyl-1-thio-β-D-galactopyranoside. The R298A 
protease was obtained by in-gel cleavage with factor Xa of the GST-R298A 
protein attached to glutathione Sepharose beads (Amersham Biosciences), fol- 
lowed by a fast protein liquid chromatography purification on a gel filtration 
column (HiLoad 16/60 Superdex 200). With the factor Xa cleavage, except for 
the R298A mutation site, the R298A protease studied here has a sequence 
identical to that of the authentic SARS-CoV 3CLpro whose structure was newly 
reported (Protein Data Bank code 2H2Z) (29). The molecular weight of the 
R298A protein was determined by a Voyager STR MALDI-TOF mass spectrom- 
eter (Applied Biosystems) using a single protease crystal (data not shown). 
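
As a quick consistency check on the two mutagenic primers quoted above, the following minimal sketch confirms that the reverse primer is the exact reverse complement of the forward primer and that the forward primer encodes the factor Xa site IEGR directly followed by the first residues of the protease. The reverse primer is read here as ...CGA TTT TGG...; the codon table is deliberately partial (only the codons that occur in the translated stretch), and all function and variable names are illustrative rather than taken from the original work.

```python
# Primer sequences as quoted in the text, with the spacing removed.
FORWARD = (
    "GGC GAC CAT CCT CCA AAA TCG GAT CTG ATC GAA GGT CGT "
    "AGT GGT TTT AGG AAA ATG GCA TTC CCG"
).replace(" ", "")
REVERSE = (
    "CGG GAA TGC CAT TTT CCT AAA ACC ACT ACG ACC TTC GAT "
    "CAG ATC CGA TTT TGG AGG ATG GTC GCC"
).replace(" ", "")

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Partial codon table: only the codons present in the translated stretch.
CODON_TABLE = {
    "ATC": "I", "GAA": "E", "GGT": "G", "CGT": "R", "AGT": "S",
    "TTT": "F", "AGG": "R", "AAA": "K", "ATG": "M", "GCA": "A",
    "TTC": "F", "CCG": "P",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate an in-frame DNA string using the partial codon table."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Locate the factor Xa site codons and translate from there to the 3' end.
site_start = FORWARD.index("ATCGAAGGTCGT")
encoded = translate(FORWARD[site_start:])

print(reverse_complement(FORWARD) == REVERSE)
print(encoded)
```

The translated stretch places Arg at position 4 and Met at position 6 after the cleavage site, which agrees with the Arg4 and Met6 numbering used for the N finger later in the text.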

Analytical ultracentrifuge. Sedimentation velocity experiments were done using 
a Beckman Coulter XL-I analytical ultracentrifuge. Briefly, the R298A protease 
was dialyzed against 10 mM Tris-HCl (pH 7.5) containing 14.4 mM β-mercaptoethanol 
and subsequently concentrated to 3 mg/ml for preparing samples at 
various protease concentrations. For a sedimentation velocity experiment, sample 
(400 μl) and reference (440 μl) solutions were loaded into standard double-sector 
centerpieces and mounted in a Beckman An-50 Ti rotor. Experiments 
were conducted at 20°C with a rotor speed of 42,000 rpm. The UV absorbance 
of the samples was monitored in a continuous mode without delay, and a step 
size of 0.003 cm was used without averaging. To avoid imprecise UV absorbance 
readings at 280 nm for samples at high concentrations, a wavelength of 290 nm 
was used to monitor the velocity experiments at protein concentrations of 2 and 
3 mg/ml, while 280 nm was used for those at lower concentrations (0.2, 0.5, 1.0, 
and 1.5 mg/ml). Multiple scans at different time points were fitted into a con- 
tinuous size distribution using SEDFIT 9.4c (24). A partial specific volume of 
0.7311 cm³/g was calculated from the amino acid sequence of the R298A protease; 
the solvent density and viscosity were also calculated by use of 
SEDNTERP (http://www.jphilo.mailway.com/). 

Crystallization, structure determination, and analysis of the R298A protease. 
The R298A protease at a concentration of 10 mg/ml was crystallized in a 2-μl 
hanging drop using a condition slightly modified from that previously reported 
(29), namely 6% polyethylene glycol 6000, 2% (+)-2-methyl-2,4-pentanediol, 5% 
isopropanol, and 0.1 M morpholineethanesulfonic acid at pH 6.0. A single 
plate-like crystal (0.4 mm by 0.1 mm by 0.005 mm) was picked up from the crystal 
clusters for diffraction after a 3-day growth period, and then the coverslip was 
moved to a new well with buffer (0.1 M morpholineethanesulfonic acid, 10% 
polyethylene glycol 6000, pH 6.0) for 10 more days. The X-ray diffraction data 
were collected at beam line X29C, National Synchrotron Light Source, 
Brookhaven National Laboratory, using a Q315 charge-coupled-device detector 
(Area Detector Systems Corp., Poway, CA) at a wavelength of 1.0810 Å. 
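
For orientation, the wavelength quoted above can be related to the photon energy and to the scattering angle of the highest-resolution reflections through E = hc/λ and Bragg's law, λ = 2d sin θ. The sketch below is only a back-of-the-envelope illustration; the function names are hypothetical, and the 1.75 Å spacing is the resolution limit reported for this structure.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV·Å


def photon_energy_kev(wavelength_angstrom: float) -> float:
    """Photon energy in keV for a wavelength given in angstroms."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def bragg_two_theta_deg(wavelength_angstrom: float, d_spacing_angstrom: float) -> float:
    """Scattering angle 2θ (degrees) from Bragg's law, λ = 2 d sin θ."""
    sin_theta = wavelength_angstrom / (2.0 * d_spacing_angstrom)
    return 2.0 * math.degrees(math.asin(sin_theta))


energy = photon_energy_kev(1.0810)               # about 11.5 keV
two_theta = bragg_two_theta_deg(1.0810, 1.75)    # about 36 degrees
print(f"{energy:.2f} keV, 2-theta = {two_theta:.1f} deg")
```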

The obtained data set was processed using the program HKL2000 (22). Ini- 
tially, an attempt to solve the structure using a single protomer from the native 
3CLpro structure (Protein Data Bank code 1UJ1) as a search model by molec- 
ular replacement was unsuccessful. As such, we have subsequently treated the 
catalytic fold (residues 12 to 180) and the extra domain (residues 200 to 300) as 
two independent search models in the program Phaser (21) by assuming that 
these two domains are in a different disposition compared to the native 3CLpro 
structure, which readily gave the structure solution. Further, model building was 
done using the program COOT (9) and was followed by refinement using the 
program CNS (5). Finally, 457 well-defined water molecules were added, and 
refinement was continued until the R value converged to 0.168 (Rfree = 0.208) for 
all reflections up to a 1.75-Å resolution. The final model contained all 306 
residues of the mutant R298A protease and had good stereochemistry, with all 
residues falling within the allowed regions of the Ramachandran plot except for 
Tyr154, which was located at a tight turn as analyzed by PROCHECK (18). 
Details of the data collection and refinement statistics are presented in Table 1. 

TABLE 1. Crystallographic data and refinement statistics for the mutant R298A 

Parameter                            Value(s)^b 

Data collection 
  Wavelength (Å)                     1.0810 
  Resolution limit (Å) 
  Space group 
  Cell parameters 
    a (Å) 
    b (Å) 
    c (Å) 
    β (°) 
  No. of observed reflections 
  No. of unique reflections 
  Completeness (%) 
  Redundancy 
  Linear R-factor 
  Overall I/σ(I) 

Refinement 
  Resolution range (Å) 
  No. of reflections 
  RMSD^e bond lengths (Å) 
  RMSD^e bond angles (°) 

Ramachandran plot 
  Favored (%) 
  Allowed (%) 
  Generously allowed (%) 
  Disallowed (%)^a 

a Only one residue, Tyr154, is in the disallowed region. This residue is located 
at the tight turn region, which leads to the abnormal dihedral angle. 

b Values in parentheses are for the highest-resolution shell. 

c $R_{\mathrm{work}} = \sum |F_{\mathrm{obs}} - F_{\mathrm{calc}}| / \sum F_{\mathrm{obs}}$, where $F_{\mathrm{calc}}$ and $F_{\mathrm{obs}}$ are the calculated and 
observed structure factor amplitudes, respectively. 

d $R_{\mathrm{free}}$ was calculated in the same manner as $R_{\mathrm{work}}$, but for 6% of the total 
reflections chosen at random and omitted from refinement. 

e RMSD, root mean square deviation. 
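
The residual defined in the Table 1 footnotes, R = Σ|Fobs − Fcalc| / ΣFobs, and the idea of reserving a random 6% of reflections as a free set can be sketched as follows. This is an illustration only: the amplitudes are made-up toy numbers, no scale factor between observed and calculated amplitudes is applied, and the helper names are hypothetical; it is not the procedure implemented in CNS.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic residual: sum|Fobs - Fcalc| / sum(Fobs)."""
    numerator = sum(abs(o - c) for o, c in zip(f_obs, f_calc))
    return numerator / sum(f_obs)


def pick_free_set(n_reflections, fraction=0.06, seed=0):
    """Randomly choose indices of reflections to omit from refinement."""
    rng = random.Random(seed)
    n_free = max(1, round(n_reflections * fraction))
    return set(rng.sample(range(n_reflections), n_free))


# Toy amplitudes (assumed values, for illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 85.0, 60.0, 44.0, 21.0]
r_toy = r_factor(f_obs, f_calc)

# A free set for a hypothetical data set of 100 reflections.
free_indices = pick_free_set(100)
print(round(r_toy, 4), len(free_indices))
```

Rwork would be evaluated over the reflections outside the free set and Rfree over those inside it, using the same formula.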


Structures were superimposed with LSQKAB and CCP4mg from the CCP4 
suite (8). The motion of the extra domain was analyzed with DynDom 1.5 (12). 
The volume of the active pocket was calculated using SURFNET (17), and all the 
figures were prepared using the PyMOL molecular graphics system (W. L. 
DeLano, DeLano Scientific LLC, San Carlos, CA). 

Protein Data Bank accession number. The atomic coordinates have been 
deposited in the Protein Data Bank under accession code 2QCY. 


RESULTS AND DISCUSSION 


Crystallization and structure determination. In the present 
study, we attempted to crystallize all of the monomeric mutants 
that we previously documented (25), which led to high-quality 
crystals of the R298A mutant. This success allowed us to 
determine the monomeric crystal structure of the SARS-CoV 
3CLpro in space group P2₁ at a resolution of 1.75 Å (Table 1), 
with one molecule per asymmetric unit. The new construct of the 
R298A mutant with the authentic sequence was also found to be 
completely inactive even at high protein concentrations. 
Moreover, the present observation that the R298A mutant exists 
as a monomer in the crystal is consistent with the results that we 
previously obtained in solution by gel filtration chromatography 
and dynamic light scattering (25), as well as by analytical 
ultracentrifuge (see below). The R298A crystal structure 
showed that no biological dimer unit like that of the wild-type 
3CLpro could be formed in the unit cell, even when the 
crystallographic twofold symmetry between adjacent 
asymmetric units is considered (Fig. 1). Moreover, the present 
crystal packing has not been found in any previously reported 
structure of the SARS-CoV 3CLpro crystallized in various space 
groups, including P2₁. 

[Fig. 1 panel labels: packing of the SARS-CoV R298A mutant in space group P2₁, (a) view down the a axis and (b) view down the c axis; (c) R298A mutant, space group P2₁; (d) PDB 2H2Z, space group C2; (e) PDB 2GT8, space group P4₃2₁2; (f) PDB 2C3S, space group P2₁2₁2; (g) PDB 2GT7, space group P2₁; (h) PDB 2A5A, space group C2.] 

FIG. 1. Crystal packing of the R298A mutant and comparison with other native SARS-CoV 3CLpro structures. Crystal packing in the bc plane (a) 
and in the ab plane (b). In a unit cell, the arrangement of the molecules of the R298A mutant in space group P2₁ (c) is different from that in the previously 
reported native SARS-CoV 3CLpro structures in space groups C2 (d and h), P4₃2₁2 (e), P2₁2₁2 (f), and P2₁ (g). For the native protease structures, 
two molecules form a biological dimer in the unit cell. One molecule is magenta, while the other molecule, related by crystallographic twofold symmetry, 
is cyan. 

Analytical ultracentrifuge characterization. To further 
characterize the state of the R298A mutant in solution, 
analytical ultracentrifuge experiments were carried out to 
measure sedimentation coefficients at six different protein 
concentrations. Analysis of the sedimentation velocity data 
collected at all these concentrations gave rise to peaks at ~2.8S 
by use of the continuous sedimentation coefficient distribution 
[c(s)] model (Fig. 2). Moreover, based on the continuous molar 
mass distribution [c(M)] model, the molecular masses of the 
R298A mutant at different concentrations were determined to 
range from 31.1 to 33.7 kDa. Fits to both the monomer and the 
monomer-dimer equilibrium models were tested, but only the 
monomer model was found to be valid. Most important, even 
at a protease concentration of up to 3 mg/ml (89.6 μM), no 
peak could be detected at ~4S, which was expected for the 
wild-type 3CLpro dimer. The results clearly indicate that the 
R298A mutant exists entirely as a monomer in solution even 
at a concentration of 3 mg/ml, in full agreement with 
the crystallographic result that the R298A mutant remains 
a monomer under the saturating crystallization condition, in 
which the protein concentration is at least as high as that of the 
initial solution used for crystallization (10 mg/ml). 

FIG. 2. Sedimentation velocity ultracentrifugation of the R298A mutant. The sedimentation experiments were carried out with a Beckman 
Coulter XL-I analytical ultracentrifuge at 20°C and 42,000 rpm at protease concentrations from 0.2 to 3 mg/ml. A wavelength of 290 nm was used 
to monitor the velocity experiments at protein concentrations of 2 and 3 mg/ml, while 280 nm was used for those at lower concentrations (0.2, 0.5, 
1.0, and 1.5 mg/ml). (a) Sedimentation velocity absorbance trace of the R298A mutant (1 mg/ml) at 280 nm. OD, optical density. (b) Residuals 
of the experimental fit of the R298A mutant at 1 mg/ml (29.8 μM). (c) Continuous sedimentation coefficient distributions [c(s)] at six different 
protease concentrations (from 0.2 to 3 mg/ml). 
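
The mass and molar concentrations quoted for the sedimentation experiments are related by the monomer molecular mass. The short sketch below converts mg/ml to μM; the molecular mass of 33.5 kDa is an assumption inferred from the quoted pair 3 mg/ml and 89.6 μM, not a value stated in the text, and the function name is illustrative.

```python
# Assumed monomer molecular mass (Da), inferred from "3 mg/ml (89.6 μM)".
ASSUMED_MW_DA = 33_500.0


def mg_per_ml_to_micromolar(conc_mg_per_ml: float, mw_da: float = ASSUMED_MW_DA) -> float:
    """Convert a mass concentration (mg/ml, equal to g/liter) to μM."""
    molar = conc_mg_per_ml / mw_da      # g/liter divided by g/mol gives mol/liter
    return molar * 1e6                  # mol/liter to μmol/liter


# The six concentrations used in the sedimentation velocity experiments.
concentrations = [0.2, 0.5, 1.0, 1.5, 2.0, 3.0]
micromolar = {c: mg_per_ml_to_micromolar(c) for c in concentrations}
for c, um in micromolar.items():
    print(f"{c:.1f} mg/ml ~ {um:.1f} uM")
```

Under this assumption the highest concentration corresponds to roughly 90 μM of monomer, so the absence of a dimer peak at that concentration is the observation the text relies on.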

Structural comparison. Remarkably, the R298A mutant still 
adopts the characteristic architecture common to all coronavi- 
rus 3CLpros, with all residues (1 to 306) well defined in the 
electron density maps, as exemplified in Fig. 3A. However, as 
seen in Fig. 3B, due to the large conformational changes and a 
40-degree reorientation between the catalytic fold (residues 12 
to 180), holding the entire catalytic dyad and a major portion 
of the substrate binding pocket, and the extra domain (residues 
200 to 306) in the R298A structure, it is impossible to 
superimpose the full structure on other dimeric 3CLpro structures. 
Nevertheless, a domain-based superposition with a fully active 
3CLpro structure with the authentic sequence at pH 
6.0 (Protein Data Bank code 2H2Z) (29) reveals that three 
regions undergo radical conformational changes in the mono- 
meric R298A structure: namely the N and C termini, as well as 
the loop (181 to 199) connecting the catalytic fold and extra 
domain (Fig. 3). On the other hand, a comparison of the 
tertiary contact maps of the mutant R298A and the active 
enzyme (Protein Data Bank code 2H2Z) structures shows that 
almost no alteration of the packing pattern occurs within the 
catalytic fold and extra domain (data not shown). Surprisingly, 


although the mutation R298A is located on the extra domain 
(200 to 300), this domain has a backbone root mean square 
deviation of 0.41 Å, much lower than 1.08 Å for the catalytic 
fold (12 to 190). It is particularly worthwhile to note that even 
the side-chain orientations of Arg298 in 2H2Z and Ala298 in 
R298A are highly similar (Fig. 3C). Examination of the resi- 
due-specific root mean square deviations indicates that while 
no large change is detected within the extra domain, relatively 
large variations can be recognized within the catalytic fold, in 
particular over several loops constituting the catalytic machin- 
ery, including residues 117 to 125, 133 to 144, and 166 to 169 
(data not shown). Very interestingly, these segments are close 
to the dimerization interface in the dimeric structure (Fig. 4A). 
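
A reorientation such as the 40-degree interdomain change described above is commonly summarized as the rotation angle of the rotation matrix relating the two domain orientations, using θ = arccos((tr R − 1)/2). The sketch below only demonstrates that formula on a synthetic rotation about the z axis; it does not reproduce the DynDom analysis, and the function names are hypothetical.

```python
import math


def rotation_about_z(angle_deg: float):
    """3x3 rotation matrix for a rotation of angle_deg about the z axis."""
    a = math.radians(angle_deg)
    return [
        [math.cos(a), -math.sin(a), 0.0],
        [math.sin(a), math.cos(a), 0.0],
        [0.0, 0.0, 1.0],
    ]


def rotation_angle_deg(matrix) -> float:
    """Rotation angle (degrees) recovered from the trace of a rotation matrix."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding error
    return math.degrees(math.acos(cos_theta))


# Synthetic example: a 40-degree rotation is recovered from its matrix.
recovered = rotation_angle_deg(rotation_about_z(40.0))
print(round(recovered, 6))
```

In practice the matrix would come from superimposing one domain and then fitting the other, which is the step that dedicated programs handle.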

How does the R298A mutation trigger a dimer-to-monomer 
switch? These findings open up an interesting question as to 
how the R298A mutation can dramatically eliminate dimeriza- 
tion without deconstructing the structural architecture and ter- 
tiary packing. Previous characterization suggests that the 
dimerization interface consists of residues from both the cat- 
alytic fold and the extra domain of the SARS-CoV 3CLpro. 
Both the N and C termini have been found to play a key role 
in maintaining the dimeric structure (7, 14, 15, 25, 26, 28). In 
particular, as shown in Fig. 4B, the N terminus (also termed 

FIG. 3. Crystal structure of the monomeric R298A mutant. (a) Simulated-annealing Fo-Fc map (contoured at 3.5 sigma) for part of the 
oxyanion loop (138 to 145) in the R298A structure. All residues within 3.5 Å of the Cα atom of Phe140 were omitted prior to refinement. (b) 
Complete structures of the mutant R298A (red) and a native SARS 3CLpro (blue) (Protein Data Bank code 2H2Z) superimposed over the 
catalytic fold (12 to 180). (c) Comparison of the extra domain (200 to 300) of the two structures. The side chains of the mutation site are displayed 
as sticks. C, C terminus; N, N terminus. (d) Comparison of the catalytic fold (12 to 180) with the N terminus and the connecting loop (181 to 
199) displayed. 


the N finger) not only extends beyond the catalytic fold to pack 
with residues on the extra domain of the same protomer but 
also forms extensive interactions with residues of the opposite 
protomer. Of these interprotomer interactions, two have been 


documented to significantly stabilize the dimeric structure, 
namely a salt bridge between the side chains of Arg4 and 
Glu290 of the opposite protomer (7) and a hydrophobic-aro- 
matic interaction between the side chains of Met6 and Tyr126 


FIG. 4. Structural features associated with dimerization. (a) A stereoview of significantly perturbed residues within the region (1 to 200) of the mutant 
R298A structure. These residues are mapped to one protomer (brown ribbon) of the dimeric structure (Protein Data Bank code 2H2Z) and displayed 
as green dots. Another protomer is shown as the violet ribbon and surface. The active-site residues His41 and Cys145 are shown as red and yellow spheres, 
respectively. (b) A stereoview of the dimeric structure (Protein Data Bank code 2H2Z) with one protomer in ribbon and another in surface modes. The 
mutated residue Arg298 is highlighted as red spheres. The N terminus of protomer A is shown as green sticks, while the C terminus is shown as brown 
sticks. The part of the oxyanion loop Ser139-Phe140-Leu141 which is converted into a short 3₁₀-helix in the mutant R298A structure is labeled and shown 
in cyan. The active-site residue His41 is pink, and Cys145 is yellow. (c) A stereoview of an interaction network responsible for maintaining the dimeric 
structure (Protein Data Bank code 2H2Z), with one protomer in purple and another in cyan. Hydrogen bonds are indicated by cyan dashed lines, 
hydrophobic interactions by red dashed lines, and salt bridges by blue dashed lines. The active-site Cys145 residues of both protomers are displayed as 
spheres. Interestingly, it appears that Tyr126 is utilized for maintaining dimerization via an aromatic-hydrophobic interaction with Met6 of the opposite 
protomer, as well as for stabilizing the catalytic machinery via an aromatic stacking interaction with Phe140 of the same protomer. 

FIG. 5. Structural characteristics of the R298A active site. (a) A stereoview of the substrate-binding pockets of the structures of the mutant 
R298A (red) and the native enzyme (Protein Data Bank code 2H2Z) (blue). Cyan sticks represent the two N-terminal residues Ser1-Gly2 of the 
opposite protomer in the dimeric structure (Protein Data Bank code 2H2Z), which is completely lacking in the monomeric R298A structure. The 
characteristic 3₁₀-helix found in the mutant R298A structure is shown in ribbon mode and labeled. Spheres are used to indicate a buried water 

of the opposite protomer (28) (Fig. 4C). Interestingly, muta- 
tion of either Met6, Glu290, or Arg298 has been demonstrated 
to trigger the dimer-to-monomer switch of the enzyme (7, 25, 
28). On the other hand, within the same protomer, a hydrogen 
bond is formed between the side chain NH, of Arg298 and the 
backbone oxygen of Met6. Furthermore, as seen in Fig. 4B, 
Arg298 is located at the end of the extra domain and folds back 
to interact with the catalytic fold. As such, it is likely that 
Arg298 serves as a key component in maintaining 
dimerization by holding together the N finger and the extra 
domain of the same protomer, thereby stabilizing 
the precise positioning and orientation of the N finger and 
extra domain, which are absolutely essential for dimer 
formation. Strikingly, it also appears that the intraprotomer 
interaction between Arg298 and Met6 can be further 
connected to the catalytic machinery of the opposite protomer 
through a two-step relay: an interprotomer hydrophobic-aromatic 
interaction between the side chains of Met6 and Tyr126 
followed by an intraprotomer aromatic side chain interaction 
of Tyr126 with Phe140, an essential component of the 
substrate-binding pocket (Fig. 4C). The phenyl ring of Phe140 interacts 
with the imidazole ring of His163 at the bottom of the S1 
specificity pocket, guaranteeing that this histidine remains un- 
charged over a wide range of pH values. Indeed, likely due to 
the loss of dimerization, the aromatic ring of Tyr126 undergoes 
a significant reorientation in the monomeric R298A structure. 
Nevertheless, the interactions shown in Fig. 4C may represent 
only part of a large network mediating dimerization, and other 
residues such as Ser147 are further involved in modulating 
dimerization via a long-range effect (4). 
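
The relay described in this paragraph can be pictured as a small contact graph in which residues are nodes and the named interactions are edges. The sketch below encodes only the contacts mentioned in the text (protomers labeled A and B as in Fig. 4) and uses a breadth-first search to trace the route from Arg298 of one protomer to His163 of the other. It is a toy illustration of the network topology under these stated assumptions, not a structural or energetic calculation, and every identifier is hypothetical.

```python
from collections import deque

# Contacts named in the text; "A:" and "B:" denote the two protomers.
CONTACTS = [
    ("A:Arg298", "A:Met6", "hydrogen bond, intraprotomer"),
    ("A:Met6", "B:Tyr126", "hydrophobic-aromatic, interprotomer"),
    ("B:Tyr126", "B:Phe140", "aromatic, intraprotomer"),
    ("B:Phe140", "B:His163", "ring stacking, intraprotomer"),
    ("A:Arg4", "B:Glu290", "salt bridge, interprotomer"),
]


def build_graph(contacts):
    """Undirected adjacency map built from (residue, residue, kind) tuples."""
    graph = {}
    for a, b, _kind in contacts:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal, removed=()):
    """Breadth-first search; 'removed' residues are treated as deleted nodes."""
    blocked = set(removed)
    if start in blocked or goal in blocked or start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen and nxt not in blocked:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


graph = build_graph(CONTACTS)
relay = shortest_path(graph, "A:Arg298", "B:His163")
broken = shortest_path(graph, "A:Arg298", "B:His163", removed=["A:Met6"])
print(relay)
print(broken)
```

In this toy graph, deleting any single node along the relay disconnects Arg298 from the opposite active site, which mirrors the idea that the network contains several individually indispensable components.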

Since the R298A mutation will release the constrained N 
finger and C terminus and then switch from a dimer to a 
monomer, it is not surprising to see large rearrangements over 
the N and C termini, as well as a 40-degree rotation between 
the catalytic fold and extra domain in the monomeric R298A 
structure. Based on the present results, we propose here that 
although the dimerization interface may be relatively large and 
its affinity is high, the network mediating dimerization is orga- 
nized in a way which contains many key components. Disrup- 
tion of any of them may result in a switch from a dimer to a 
monomer. This arrangement may offer a significant advantage by 
allowing the enzyme to follow an association-activation-catalysis-dissociation 
catalytic cycle as previously proposed (6, 13) and/or to be 
extensively regulated by unknown in 
vivo binding partners (25, 26). On the other hand, this also 
implies that it may be relatively easy to obtain dimerization 
inhibitors, which in the future could be linked to active-site 
inhibitors to design highly specific bifunctional inhibitors 
of the SARS-CoV 3CLpro (25). 

Why is dimerization absolutely essential for catalysis? The 
CoV 3CLpro shares a similar catalytic machinery with the 
picornavirus 3C protease, accomplishing catalytic tasks 
through a catalytic mechanism similar to that of chymotrypsin (1, 2, 
31). However, instead of having the catalytic triad found in 
chymotrypsin, they possess only a catalytic dyad, composed of 
the residues His41 and Cys145, with a buried water molecule at 
the position normally occupied by the side chain of Asp, the 
third member of the catalytic triad (19, 20, 23, 27, 31, 32). This 
water molecule forming three hydrogen bonds with His41, 
His164, and Asp187 is proposed to be a key component of the 
catalytic machinery. On the other hand, in the 3C protease, the 
extra domain is completely absent, and also there is no evi- 
dence suggesting that dimerization is required to activate its 
catalysis. Therefore, it is of fundamental interest to understand 
how in the SARS-CoV 3CLpro, dimerization suddenly be- 
comes coupled to catalysis. 

Surprisingly, although the mutation R298A is located at the 
end of the extra domain, in the monomeric R298A structure, 
extensive structural variations are observed in the catalytic 
fold, in particular, the loops constituting the catalytic machin- 
ery. This observation is consistent with the previous proposal 
that the active-site loops of the SARS-CoV 3CLpro are very 
dynamic and thus sensitive to the variation of environmental 
conditions (19, 27, 29-31, 32). As seen in Fig. 5A, many residues 
making up the catalytic machinery undergo significant 
changes in the monomeric R298A structure. More specifically, 
the loop residues at positions 138 to 144 show much more 
profound changes than other regions, such as the buried water 
molecule and its hydrogen-bonded His41, His164, and Asp187. 
This is reasonable because the loop residues 
Gly138-Ser139-Phe140-Leu141 themselves directly contact the surface of the 
opposite protomer. Similarly, the antiparallel β sheet over 
residues 111 to 129 surrounding the catalytic machinery also 
undergoes large rearrangements because many residues on this 
β sheet, such as Tyr126, also have direct contacts with the other 
protomer. Nevertheless, as seen in Fig. 5A, the most distinguishing 
characteristic of the mutant R298A catalytic machinery 
is the formation of a short 3₁₀-helix by residues 
Ser139-Phe140-Leu141, which instead adopt a loop conformation in 
the active enzyme structure (Protein Data Bank code 2H2Z). 
Most importantly, this chameleon conversion abolishes the key 
stacking interaction between the rings of Phe140 and His163, as 
well as significantly twists the conformation of the residues 
Gly143-Ser144-Cys145, thus leading to the collapsed substrate- 


molecule at the position normally occupied by the side chain of an acidic residue, the third member of the catalytic triad. Interestingly, this water 
molecule, together with its hydrogen-bonded residues His41, His164, and Asp187, is highly similar in both the mutant R298A and native enzyme 
(Protein Data Bank code 2H2Z) structures. (b) A stereoview of the conformations over the residues Ile136-Lys137-Gly138-Ser139-Phe140-Leu141- 
Asn142-Gly143-Ser144-Cys145 in different crystal structures. Red, catalytically inactive R298A; green, catalytically inactive protomer of 1UJ1; 
yellow, catalytically inactive 2BX4; violet, catalytically inactive 2BX3; cyan, catalytically inactive 2GT8; blue, catalytically active 2H2Z. Active-site 
cavities mapped to the mutant R298A structure (c), the fully active native enzyme (Protein Data Bank code 2H2Z) (d), and two different protomers 
of a partly active native enzyme (Protein Data Bank code 1UJ1) (e). Previously, one protomer of the structure (Protein Data Bank code 1UJ1) 
was characterized to be catalytically active, while another was inactive due to the collapsed substrate-binding pocket and oxyanion hole (4). The 
characteristic formation of the short 3₁₀-helix can be found in the monomeric and inactive R298A structure as well as in the collapsed protomer 
but not in the active protomer of a dimeric structure (Protein Data Bank code 1UJ1). 

binding pocket and oxyanion hole. This striking observation 
inspired us to examine all available crystal structures of the 
SARS-CoV 3CLpro. Surprisingly, the results indicate 
that in the dimeric structures, this characteristic is absent in the 
active protomers but can be found in the collapsed protomers. 
For example, in the structure with Protein Data Bank code 
1UJ1, a dimer with two different 
protomers, one catalytically active and another collapsed (31), 
the conformation assumed by residues 136 to 146 in the active 
protomer is highly similar to that of the fully active enzyme 
(Protein Data Bank code 2H2Z), while the conformation in 
the collapsed protomer highly resembles that of the R298A 
structure (Fig. 5B). On the other hand, it was also recently 
shown that the collapsed protomer in the 1UJ1 structure could 
be successfully converted to the catalytically active one by 
removing the extra N-terminal residues left over from cleavage 
of the affinity tag (29). This strongly implies that although it is 
in the dimeric enzyme, the dynamic catalytic machinery can be 
trapped in the collapsed conformations by a variety of un- 
favorable factors. However, the collapsed enzyme still has the 
potential to be reactivated once the unfavorable factors are 
removed. By contrast, it appears that if dimerization is lacking, 
the catalytic machinery will be permanently frozen in the col- 
lapsed state, thus leading to an irreversible inactivation of the 
enzyme as observed on the R298A mutant, which has no cat- 
alytic activity even at a very high concentration for both pre- 
vious constructs (25) as well as the authentic sequence studied 
here. During revision of the manuscript, a unique crystal 
structure was reported for the avian infectious bronchitis virus 
3CLpro (30). In each asymmetric unit, three protease mole- 
cules, named A, B, and C, exist. While molecules A and B form 
a typical catalytically active and symmetrical homodimer like 
the dimeric SARS-CoV protease, molecule C is not involved in 
such a dimer. Molecule C was thus proposed to be a monomer 
frozen in crystal by the binding and fixation of its C terminus 
into the active site of the dimer formed by molecules A and B. 
Very interestingly, similar to our present observation for the 
R298A mutant, this trapped IBV 3CLpro monomer (molecule 
C) also shows significant differences from molecules A and B 
over the termini and the loop region connecting domains II and 
III, as well as the active-site pocket. In particular, the short 
3₁₀-helix that we found to be characteristic of the collapsed 
enzymes is also present in molecule C but not in molecules A 
and B. 

Compared with the collapsed protomers in the dimeric 
structures, the monomeric R298A structure also has profound 
structural changes over other regions critical for catalysis. For 
instance, very large changes are also found over residues 180 to 
200 forming the S2 subsite. Therefore, to comprehensively 
assess the features of the catalytic machinery, we quantitatively 
calculated the volumes of the active-site cavity of the R298A 
and other SARS-CoV 3CLpro structures by using the program 
SURFNET. Interestingly, for the dimeric structure (Protein 
Data Bank code 1UJ1) composed of two asymmetric protomers, 
the active protomer without the characteristic 3₁₀-helix 
has a large and deep cavity over the active site similar to that 
of the fully active enzyme (Protein Data Bank code 2H2Z). By 
contrast, the collapsed protomer with the characteristic 3₁₀-helix 
has an active-site cavity with a significantly reduced volume. 
More dramatically, the active-site cavity of the R298A 
structure is completely fragmented (Fig. 5C). As a consequence, 
the catalytic machinery of the monomeric R298A mutant 
is no longer suited to binding substrates and leaves no space 
to accommodate glutamine at the substrate analog P1 site or 
a tetrahedral intermediate. 

Conclusion. The present study reveals that despite its location 
at the extra domain, Arg298 is a key component of an 
integrated network modulating dimerization. Through the use of 
overlapping residues such as Tyr126, dimerization may be elegantly 
coupled to catalysis of the SARS-CoV 3CLpro. As a consequence, 
the mutation of Arg298 not only eliminates dimerization 
but also irreversibly inactivates the enzyme by permanently 
freezing the catalytic machinery in the collapsed state, 
characterized by the formation of a short 3₁₀-helix from a 
chameleon active-site loop. The current results not only suggest 
a possible mechanism that rationalizes the observation 
that the catalytic machinery of the SARS-CoV 
3CLpro is under extensive regulation by the unique extra domain 
but also bear implications for understanding the general 
principles governing the evolution of regulatory devices for 
catalytic machinery through the introduction of oligomerization. 